NetNews Offline 2



Path: anvil.ugrad.cs.ubc.ca!not-for-mail
From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku)
Newsgroups: comp.lang.c
Subject: Re: [Perf:] mem*() procs vs. array looping
Date: 28 Feb 1996 12:24:29 -0800
Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
Message-ID: <4h2dltINNlag@anvil.ugrad.cs.ubc.ca>
References: <4glkq1$gu7@gazette.tandem.com> <4h1n14$3b3@news.interpath.net>
NNTP-Posting-Host: anvil.ugrad.cs.ubc.ca

In article <4h1n14$3b3@news.interpath.net>,
Scott McMahan - Softbase Systems <softbase@mercury.interpath.net> wrote:
>Francis E. Chang (francis@patch.tandem.com) wrote:
>
>: Are mem*() procedures performance boosters?
>
>I have had some experience with this exact question, and want to
>share what I found out.
>
>1. The only real answer is, "it depends". It depends on who wrote the
>stdlib routines, how conscientious they were, how much they knew about
>the hardware they were writing on, etc. From one library to the
>next, the answer could change.

What makes you think that memcpy() and friends are necessarily function
calls to library routines? Since they are functions defined by the
standard, a compiler in a hosted C environment has license to generate
inline code for a reference to them, provided that they have not been
redefined with internal linkage (that is, as static functions) in the
translation unit where the reference appears.

You can redefine a standard function such as memcpy() this way if you
don't #include the header that declares it with external linkage, and
you define your own version as static. In that case, the compiler will
generate code that calls _your_ memcpy() rather than emit inline code
that implements the standard memcpy().

>2. The only way to tell is on a per-stdlib basis! Measuring execution
>of the program in a profiler for every different memxxx in every
>different library. You *have* to profile the program and see what
>is going on. Does calling memxxx even matter in the overall scheme?

You don't have to profile. This is far too simple to require profiling.
Timing a large number of memcpy() operations should be quite telling.

>3. Most if not all commercial compilers will write things like memcpy
>in assembly language, and some processors have native instructions for
>doing memory copies and stuff that make them much faster than any C
>code you could write, because they don't have to continually load
>addresses like you do in a loop.

Not only that, but some machines have special co-processors for doing
these kinds of blits (e.g. Amiga, some Suns).

>4. I had a program where memset was THE bottleneck, taking up more time
>than I/O. I re-wrote a memory zeroing function using the most unrolled,
>efficient loop I could write, and it was still orders of magnitude
>slower than the system memset. No way I could make it faster in C.

The compiler may have generated inline code using the best idiom for
the architecture. That wins hands down, unless you give up the standard
conformance of your code by writing your own explicit inline assembly
language that is even better. Generally not worth it.

>5. WRT #4, I removed the memset. It was initializing a buffer, and I
>said "forget it" and used the uninitialized buffer and took my chances.
>Changing the design was more efficient than coding hokey
>optimizations! After removing memset, I/O functions accounted for
>90% or more of the time spent in the program, which I could live with.

Did you assume anything about the uninitialized contents? That is never
a good thing for auto or dynamically allocated storage. Though if you
need the speed, you can always try sacrificing portability.

>Scott